Proposal for Pluggable Optimizer Interface ========================================== Overview ======== We have presented at PGCon 2007 in Ottawa couple of other approaches and algorithms that can be used for query optimization, see http://www.pgcon.org/2008/papers/Execution_Plan_Optimization_Techniques_Julius_Stroffek.pdf We have focused on algorithms for searching the space of all possible orders of joins including bushy trees. The possible algorithms include * Dynamic Programming (already implemented in PostgreSQL) * Genetic Algorithm (already implemented in PostgreSQL) * Dijkstra's Algorithm * A* Search Algorithm * Greedy “Nearest Neighbor” Algorithm * Hill-Climbing * Simulated Annealing * Iterative Improvement * Two Phase Optimization * Toured Simulated Annealing Choosing the best algorithm from the above list is difficult. We have to consider the length of the optimizer computation vs. the quality of the solution which we would like to achieve. We may want to do some hybrid optimization – run couple of the algorithms from the above and choose the best solution found. We might run some very fast algorithm at the beginning and depending on the solution cost we may decide whether it is worthwhile to try to optimize the plan even more (using other algorithm with longer running time but producing better solutions). Therefore we would like to propose an interface which can be used to switch between different optimizer algorithms and/or allow a user to write and use his own implementation. It would allow the community to ship more optimizer algorithms as contrib modules and users may then decide which of those algorithms should be used for their queries. Creating an optimizer ===================== We would propose to create a catalog holding the available optimizers in the system called pg_optimizer. Users could than use a SET command to switch between different optimizers. postgres=# select * from pg_optimizer; optname | optproc ---------+------------------- geqo | geqo_optimizer dynamic | dynamic_optimizer greedy | greedy_optimizer (4 rows) postgres=# set optimizer=greedy; SET Optimizer Invocation Point ========================== There is a code in function make_rel_from_joinlist which decides whether to invoke dynamic programming or genetic algorithm for query optimization. We would propose to place the invocation of the plugged optimizer to the same place and with the same parameters as function geqo and make_one_rel_by_joins are currently invoked. Creating and dropping an optimizer ================================== The optimizer function have to be implemented as a C-Language Function using “Version 1 Calling Conventions”. The return type of the function is RelOptInfo * and the arguments passed to the function are 1.PlannerInfo *root 2.int levels_needed 3.List * initial_rels The proper “CREATE FUNCTION” statement have to be used to create the optimizer function. > CREATE FUNCTION greedyoptimizer(internal, int, internal) > RETURNS internal > AS 'mylib', 'greedy_optimizer' > LANGUAGE C > ; Once, the optimizer function is created user may create an optimizer using the function with the statement > CREATE OPTIMIZER greedy ( > function = greedyoptimizer > comment = 'Greedy Nearest Neighbor Optimizer' > ); If the user decides not to use the optimizer anymore he can invoke > DROP OPTIMIZER greedy; User have to also drop the optimizer function with > DROP FUNCTION greedyoptimizer; Project TODO List ================= 1.Create a pg_optimizer catalog to hold available optimizers. 2.Create wrappers above the current dynamic programming and genetic algorithm optimizers to be used to call those implementations. 3.Modify the parser and add the functions to handle and execute the CREATE/DROP OPTIMIZER statements. 4.Modify GUC that it would be possible to switch between optimizers. 5.Change the code at the optimizer invocation point that the appropriate optimizer function would be called. 6.Handle object dependencies – make an entry in pg_depend that optimizer depends on its function. 7.Implement '\dO' command that will list the available optimizers. 8.Create a contrib module and ship some other optimizer algorithms. 9.Any other suggestion, comments and changes that will come out from the review of this proposal. Things to Decide ================ 1.Rights. Who can create/drop optimizers? Who can use the optimizer? Are the rights for optimizer function enough? Probably, creating and dropping optimizer could be allowed just for DB administrators and rights for optimizer functions are enough to restrict the usage. No need to introduce “GRANT ON OPTIMIZER”. 2.How to behave when dropping an optimizer which is currently running or is set up in other sessions. 3.Any other suggestion, comments and changes that will come out from the review of this proposal. Future Possible Improvements ============================ * Implement more algorithms and implement also more complex combinations of the algorithms. * Create even more optimizer invocation points. Different implementations might be invoked in different places of the current code. If this would make sense and it will come up as a requirement for some optimizer algorithm. * Do some work on “benchmarking” the algorithms depending on the cost of the plan found, running time, etc. * Consider some heuristic for choosing the different optimizers for different queries.